Home

Chart Toppers

What the ARIA charts and Hottest 100 say about Australia’s changing music tastes

Music charts around the world give insight into the ever-changing music tastes of the populations they represent. In Australia, the Australian Recording Industry Association (ARIA) has published weekly rankings of tracks and albums selling in the Australian market, under the name of the ARIA Charts, since 1970. Triple J, a national Australian radio station aimed at listeners of alternative music, also hosts an annual ranking event, the Triple J Hottest 100: a vote-driven chart of the 100 most popular songs among its listeners.

Our team aims to investigate how Australian music tastes have changed over time by comparing the audio features of tracks from the yearly ARIA and Triple J Hottest 100 charts from 1970 to 2019. We use the Spotify API to map each charting track to its corresponding Spotify URI, allowing us to analyse individual tracks against key attributes: duration, key, mode, time signature, acousticness, speechiness, danceability, energy, loudness, valence/happiness and tempo.

The documentation for the endpoint we are using to get the analysis data is available here. We are sourcing the ARIA charts and Triple J Hottest 100 rankings from these sites:

This dashboard uses Plotly to allow you to interact with our graphs. Use the tools when hovering over charts to select and filter attributes or focus in on time ranges!

The Team

Our Discove-R-Weekly team as part of ETC1010:

  • Alvin Lieu
  • Angela Oon
  • Hannah Shiau
  • Kalana Vithana
  • Macey Jackson
  • Poornisha Senthilkumar
  • Subhashini Singhal

The Data

Data Sourcing

The data was sourced by calling the above Spotify API endpoint. We used Spotify's JavaScript client libraries, since they were what we were most familiar with.

The source code for pulling data from Spotify is available on GitHub here.

We created or sourced Spotify playlists containing the songs for each year’s charts from 1970 to 2019. These playlists could then be passed to the CLI tool.

Given a list of playlist URIs or URLs in a .txt file, the program will get the tracks associated with that playlist, and call the Spotify Audio Features endpoint to get insightful information about the tracks.
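
The CLI tool itself is written in JavaScript and is not reproduced here. As a rough illustration of its first step only, the sketch below shows in R how a playlist URI or URL could be reduced to a bare playlist ID. The helper name `playlist_id_from` and the example URL form are assumptions made for illustration, not part of the original tool.

```r
# Hypothetical helper: extract the playlist ID from either a Spotify URI
# ("spotify:playlist:<id>") or a web URL (".../playlist/<id>?si=...").
playlist_id_from <- function(x) {
  x <- trimws(x)
  # Drop any query string, then keep whatever follows the last ":" or "/".
  x <- sub("\\?.*$", "", x)
  sub("^.*[:/]", "", x)
}

# Stand-in for the lines of the .txt input file (one playlist per line).
playlist_lines <- c(
  "spotify:playlist:6YKI2VYSO9iZtyaLb3ZUsG",
  "https://open.spotify.com/playlist/03vMytYzGsKy1dQkGDdiCp?si=abc123"
)

ids <- playlist_id_from(playlist_lines)
print(ids)
```

Each ID could then be used to request the playlist's tracks and their audio features from the Spotify API.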

This data was written out in CSV format with the following structure:

└───data
    │   playlists.csv
    │
    ├───features
    │       0hIiy3ihpzsIX9Dd6RVtWw.csv
    │       0kgHtoYJSMS3pMMciC3Us4.csv
    │       ...
    │
    ├───artists
    │       0hIiy3ihpzsIX9Dd6RVtWw.csv
    │       0kgHtoYJSMS3pMMciC3Us4.csv
    │       ...
    │
    └───tracks
            0hIiy3ihpzsIX9Dd6RVtWw.csv
            0kgHtoYJSMS3pMMciC3Us4.csv
            ...
  • playlists.csv : a lookup of the playlist URI and the playlist name and description
  • /features: folder with an individual csv for each playlist, containing the audio features data.
  • /artists: folder with an individual csv for each playlist, containing data about each artist (can be multiple for each track).
  • /tracks: folder with an individual csv for each playlist, containing data about each track.

playlists.csv

Has consolidated information about each playlist.

Primary Key:

  • playlist_uri: Spotify URI for the playlist. Also the title of corresponding CSV files.

Other Attributes:

  • title: Title of the playlist
  • description: Description of the playlist
  • url: URL of the playlist

After cleaning & consolidation:

  • year: Year of the chart
  • chart: H100, ARIA or OTHER

features/...

Has audio features for each track returned from the Spotify Audio Features endpoint.

Primary Keys:

  • playlist_uri: Spotify URI for the playlist
  • uri: Spotify URI of the track

Other Attributes:

The following features are saved in the CSV. Read the description and distribution for each attribute from the above Spotify API documentation.

  • duration_ms
  • key
  • mode
  • time_signature
  • acousticness
  • danceability
  • energy
  • instrumentalness
  • liveness
  • loudness
  • speechiness
  • valence
  • tempo

tracks/...

Identifying information about each track, as returned from the Spotify Get Tracks endpoint.

Note: Tracks can have multiple artists. The CSV has been formatted to have an entry for each listed artist of the track.

Primary Keys:

  • playlist_uri: Spotify URI for the playlist
  • uri: Spotify URI of the track
  • artist: name of a featured artist
  • artist_uri: Spotify URI of the artist

Other Attributes:

The following features are saved in the CSV. Read the description and distribution for each attribute from the above Spotify API documentation.

  • album: name of the album
  • album_uri: Spotify URI for the album
  • disc_number
  • duration_ms
  • name
  • popularity
  • explicit
  • uri
  • link: renamed from href
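
Because the tracks CSVs hold one row per listed artist, a track with two artists appears twice. The sketch below shows one way this could be collapsed back to one row per track. The toy values and the name `tracks_unique` are illustrative assumptions, not taken from the dataset.

```r
library(dplyr)

# Toy rows in the shape of the tracks CSVs (values are illustrative).
tracks <- tibble(
  playlist_uri = "spotify:playlist:example",
  uri    = c("spotify:track:A", "spotify:track:A", "spotify:track:B"),
  name   = c("Song A", "Song A", "Song B"),
  artist = c("Artist 1", "Artist 2", "Artist 3")
)

# Collapse to one row per track, keeping all artist names in one string.
tracks_unique <- tracks %>%
  group_by(playlist_uri, uri, name) %>%
  summarise(artists = paste(artist, collapse = ", "), .groups = "drop")

print(tracks_unique)
```

Counting rows of `tracks_unique` rather than `tracks` avoids counting multi-artist tracks more than once.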

artists/...

Identifying information about each artist, as returned from the Spotify Get Artists endpoint.

Note: Genres for each artist are comma separated and should be expanded

Primary Keys:

  • uri: Spotify URI of the artist

Other Attributes:

The following features are saved in the CSV. Read the description and distribution for each attribute from the above Spotify API documentation.

  • name: name of the artist
  • followers: Number of followers
  • popularity
  • uri
  • genres: comma separated string of genres
  • link: renamed from href
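
The note above says the comma-separated genres should be expanded. A minimal sketch of that step, assuming tidyr is available, is shown below. The two sample rows mirror values visible in the artists data, and the output name `artist_genres` is an assumption.

```r
library(dplyr)
library(tidyr)

# Two sample rows in the shape of the artists CSVs.
artists <- tibble(
  name   = c("Mark Ronson", "OMI"),
  genres = c("dance pop,pop", "dance pop")
)

# Expand the comma-separated string into one row per artist-genre pair.
artist_genres <- artists %>%
  separate_rows(genres, sep = ",") %>%
  rename(genre = genres)

print(artist_genres)
```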

Data Cleaning & Consolidation

After the above process, we had to get the data into R and create our dataframes. Since the data coming from the Spotify API is reliable and clean, most of the work involves merging together the data from the separate csvs into a single dataframe. Our process is outlined below.

Import CSV

First, let’s read in the playlists.csv file, which has information about all the playlists included in the dataset, along with the per-playlist CSVs in the features, tracks and artists folders.

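library(tidyverse)  # provides readr, purrr, dplyr and stringr

# Lookup table with one row per playlist
playlist_data <- read_csv(here::here('Data/playlists.csv'))

# Each folder holds one CSV per playlist; read them all and row-bind them.
# list.files() takes a regular expression, so match the ".csv" extension explicitly.
features_data <-
    list.files(path = here::here('Data/features'),
               pattern = "\\.csv$",
               full.names = TRUE) %>%
    map_df(read_csv)

tracks_data <-
    list.files(path = here::here('Data/tracks'),
               pattern = "\\.csv$",
               full.names = TRUE) %>%
    map_df(read_csv)

artists_data <-
    list.files(path = here::here('Data/artists'),
               pattern = "\\.csv$",
               full.names = TRUE) %>%
    map_df(read_csv)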

glimpse(playlist_data)
Rows: 77
Columns: 4
$ playlist_uri <chr> "spotify:playlist:6YKI2VYSO9iZtyaLb3ZUsG", "spotify:play…
$ title        <chr> "ARIA Top 100 Singles of 1970", "ARIA Top 100 Singles of…
$ description  <chr> "Unavailable: Lionel Rose - I Thank You, Mary Hopkin - K…
$ url          <chr> "https://api.spotify.com/v1/playlists/6YKI2VYSO9iZtyaLb3…
glimpse(features_data)
Rows: 7,605
Columns: 15
$ playlist_uri     <chr> "spotify:playlist:03vMytYzGsKy1dQkGDdiCp", "spotify:…
$ duration_ms      <dbl> 269667, 180566, 229526, 241693, 295502, 176561, 2535…
$ key              <dbl> 0, 4, 10, 4, 5, 7, 5, 1, 5, 2, 1, 9, 1, 9, 2, 0, 4, …
$ mode             <dbl> 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0…
$ time_signature   <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4…
$ acousticness     <dbl> 0.00801, 0.16600, 0.36900, 0.63400, 0.32900, 0.00346…
$ danceability     <dbl> 0.856, 0.782, 0.689, 0.566, 0.470, 0.723, 0.489, 0.5…
$ energy           <dbl> 0.609, 0.685, 0.481, 0.664, 0.431, 0.809, 0.597, 0.8…
$ instrumentalness <dbl> 8.15e-05, 1.18e-05, 1.03e-06, 0.00e+00, 0.00e+00, 1.…
$ liveness         <dbl> 0.0344, 0.1600, 0.0649, 0.1160, 0.0854, 0.5650, 0.10…
$ loudness         <dbl> -7.223, -6.237, -7.503, -5.303, -6.129, -3.081, -6.6…
$ speechiness      <dbl> 0.0824, 0.0309, 0.0815, 0.0464, 0.0342, 0.0625, 0.02…
$ valence          <dbl> 0.928, 0.603, 0.283, 0.437, 0.289, 0.274, 0.324, 0.6…
$ tempo            <dbl> 114.988, 118.016, 80.025, 128.945, 157.980, 98.007, …
$ uri              <chr> "spotify:track:32OlwWuMpZ6b0aN2RZOeMS", "spotify:tra…
glimpse(tracks_data)
Rows: 8,893
Columns: 12
$ playlist_uri <chr> "spotify:playlist:03vMytYzGsKy1dQkGDdiCp", "spotify:play…
$ album        <chr> "Uptown Special", "Uptown Special", "Me 4 U", "Me 4 U", …
$ album_uri    <chr> "spotify:album:3vLaOYCNCzngDf8QdBg2V1", "spotify:album:3…
$ artist       <chr> "Mark Ronson", "Bruno Mars", "OMI", "Felix Jaehn", "Wiz …
$ artist_uri   <chr> "spotify:artist:3hv9jJF3adDNsBSIQDqcjp", "spotify:artist…
$ disc_number  <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ duration_ms  <dbl> 269666, 269666, 180565, 180565, 229525, 229525, 241693, …
$ name         <chr> "Uptown Funk (feat. Bruno Mars)", "Uptown Funk (feat. Br…
$ popularity   <dbl> 81, 81, 76, 76, 82, 82, 74, 73, 60, 60, 60, 71, 80, 77, …
$ explicit     <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, …
$ uri          <chr> "spotify:track:32OlwWuMpZ6b0aN2RZOeMS", "spotify:track:3…
$ link         <chr> "https://api.spotify.com/v1/tracks/32OlwWuMpZ6b0aN2RZOeM…
glimpse(artists_data)
Rows: 8,834
Columns: 6
$ name       <chr> "Mark Ronson", "Bruno Mars", "OMI", "Felix Jaehn", "Wiz Kh…
$ followers  <dbl> 849349, 27072330, 552534, 1040438, 9412211, 11554345, 4205…
$ popularity <dbl> 79, 89, 69, 82, 88, 84, 82, 85, 83, 75, 83, 84, 74, 95, 91…
$ uri        <chr> "spotify:artist:3hv9jJF3adDNsBSIQDqcjp", "spotify:artist:0…
$ link       <chr> "https://api.spotify.com/v1/artists/3hv9jJF3adDNsBSIQDqcjp…
$ genres     <chr> "dance pop,pop", "dance pop,pop", "dance pop", "dance pop,…

Extracting Info From Playlist Titles

We need to extract the year of each playlist and whether it was for the Hottest 100 or the ARIA Chart. We flag these as H100 and ARIA respectively; anything else is flagged as OTHER.

# Take the four-digit year from the end of the title and flag the chart type.
playlist_data <- playlist_data %>%
  mutate(
    year = as.numeric(str_extract(title, "\\d{4}$")),
    chart = as.factor(case_when(
      str_detect(title, fixed("ARIA")) ~ "ARIA",
      str_detect(title, fixed("Hottest 100")) ~ "H100",
      TRUE ~ "OTHER"
    ))
  )

head(playlist_data) %>% tibble()
# A tibble: 6 x 6
  playlist_uri      title      description            url             year chart
  <chr>             <chr>      <chr>                  <chr>          <dbl> <fct>
1 spotify:playlist… ARIA Top … Unavailable: Lionel R… https://api.s…  1970 ARIA 
2 spotify:playlist… ARIA Top … Unavailable: Lally St… https://api.s…  1971 ARIA 
3 spotify:playlist… ARIA Top … Unavailable: Slim New… https://api.s…  1972 ARIA 
4 spotify:playlist… ARIA Top … Unavailable: Jamie Re… https://api.s…  1973 ARIA 
5 spotify:playlist… ARIA Top … Unavailable: Sister J… https://api.s…  1974 ARIA 
6 spotify:playlist… ARIA Top … Unavailable: Bob Huds… https://api.s…  1975 ARIA 
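
With year and chart attached to each playlist, the per-track tables can be labelled by joining on playlist_uri. The sketch below illustrates this on toy data. The placeholder URIs and the name `features_by_chart` are assumptions for illustration.

```r
library(dplyr)

# Toy playlist lookup, as it looks after the year and chart columns are added.
playlists <- tibble(
  playlist_uri = c("spotify:playlist:p1", "spotify:playlist:p2"),
  year  = c(2019, 2019),
  chart = factor(c("ARIA", "H100"))
)

# Toy audio features rows (one row per playlist-track pair).
features <- tibble(
  playlist_uri = c("spotify:playlist:p1", "spotify:playlist:p1", "spotify:playlist:p2"),
  uri          = c("spotify:track:A", "spotify:track:B", "spotify:track:A"),
  danceability = c(0.856, 0.782, 0.856)
)

# Attach year and chart to every track row via the shared playlist_uri key.
features_by_chart <- features %>%
  left_join(select(playlists, playlist_uri, year, chart), by = "playlist_uri")

print(features_by_chart)
```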

Write To CSV

We can now write the consolidated dataframes to csvs for submission.

# Paths are relative to the project root, matching the import step (no leading slash).
write.csv(playlist_data, file = here::here('Data/cleaned/playlists.csv'))
write.csv(features_data, file = here::here('Data/cleaned/features.csv'))
write.csv(tracks_data, file = here::here('Data/cleaned/tracks.csv'))
write.csv(artists_data, file = here::here('Data/cleaned/artists.csv'))

The Present

The Present Day - 2019 ARIA Chart

Let’s first have a look at what the charts are like in the present day. The 2020 charts have not been finalised yet, so we’re analysing the top 100 songs from the ARIA Chart of 2019.

Top Artists

The table lists the top 10 artists by number of tracks on the 2019 ARIA Top 100 chart. Post Malone leads with 7 tracks, followed by Ed Sheeran and Khalid with 6 tracks each. Artists with 3 tracks on the chart round out the bottom of the top 10.

Top 10 artists of 2019 in ARIA chart

Artist                 No. of tracks
Post Malone            7
Ed Sheeran             6
Khalid                 6
Billie Eilish          5
5 Seconds of Summer    3
Ariana Grande          3
Drake                  3
Marshmello             3
Sam Smith              3
Taylor Swift           3
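
A table like this can be produced by counting tracks per artist. The sketch below is one way to do it with dplyr, using toy data in the one-row-per-artist shape of the tracks CSVs. The artist names, track URIs and column name `n_tracks` are illustrative assumptions.

```r
library(dplyr)

# Toy track-artist rows for a single year's chart.
tracks_2019 <- tibble(
  uri    = c("t1", "t2", "t3", "t4", "t5"),
  artist = c("Artist A", "Artist A", "Artist B", "Artist A", "Artist B")
)

# Count distinct tracks per artist and keep the ten most frequent.
top_artists <- tracks_2019 %>%
  distinct(uri, artist) %>%
  count(artist, name = "n_tracks", sort = TRUE) %>%
  slice_head(n = 10)

print(top_artists)
```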

Top Genres

The most prevalent genre in 2019 was pop, which appeared about twice as often as the next most common genre. Much of this can potentially be attributed to album releases by popular pop-centric artists such as Taylor Swift, Ed Sheeran, Billie Eilish and Ariana Grande. Each of these artists had at least three tracks in the Top 100 Chart, totalling 17 tracks, or 17% of the chart. Besides pop and its derivatives such as dance pop and post-teen pop, rap and EDM are present in the top 10, with artists such as Post Malone bolstering rap, and Marshmello and Calvin Harris helping to keep EDM within the top 10.

Top 10 Genres

Spotify associates multiple genres with each artist, and each track can feature multiple artists with their genres weighted equally for this chart.
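
The equal weighting described above amounts to counting every track-artist-genre combination once. The sketch below illustrates this on toy data, assuming the genres have already been expanded to one row per artist and genre. All table and column names here are illustrative.

```r
library(dplyr)

# Toy link table: which artists appear on which tracks.
track_artists <- tibble(
  uri        = c("t1", "t1", "t2"),
  artist_uri = c("a1", "a2", "a1")
)

# Toy genre table: one row per artist-genre pair.
artist_genres <- tibble(
  artist_uri = c("a1", "a1", "a2"),
  genre      = c("pop", "dance pop", "rap")
)

# Every track inherits every genre of every listed artist.
# Base merge() is used because this is intentionally a many-to-many join.
track_genres <- merge(track_artists, artist_genres, by = "artist_uri")

# Each track-artist-genre combination counts once.
genre_counts <- count(track_genres, genre, sort = TRUE)

print(genre_counts)
```

A side effect of this weighting is that tracks with many artists, or artists tagged with many genres, contribute more to the totals.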

Analysing Feature Distribution

We can see from this figure that for songs that were popular in 2019, the median levels of danceability and energy were relatively high, compared to levels of acousticness, instrumentalness, liveness and speechiness. This indicates that Australians generally preferred songs that feel energetic and are suitable for dancing.

Looking at the median level of valence and the whiskers, there seems to be an approximately even split between songs that sound more positive and songs that sound more negative. The middle 50% of the data is more spread out for valence than for the other features, and is slightly skewed towards the lower values, indicating a slightly higher concentration of popular songs in 2019 that sounded more negative.
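
The medians and spreads read off the box plot can also be computed directly. The sketch below reshapes a few feature columns to long format and summarises each one. The three sample rows reuse values visible in the features data, and the names `features_2019` and `feature_summary` are assumptions.

```r
library(dplyr)
library(tidyr)

# Three sample rows in the shape of the audio features table.
features_2019 <- tibble(
  uri          = c("t1", "t2", "t3"),
  danceability = c(0.856, 0.782, 0.689),
  energy       = c(0.609, 0.685, 0.481),
  valence      = c(0.928, 0.603, 0.283)
)

# One row per track-feature pair, then the median and IQR of each feature.
feature_summary <- features_2019 %>%
  pivot_longer(-uri, names_to = "feature", values_to = "value") %>%
  group_by(feature) %>%
  summarise(med = median(value), iqr = IQR(value), .groups = "drop")

print(feature_summary)
```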

Feature Distribution

ARIA vs Hottest 100

Comparing Charts - ARIA vs Hottest 100

The ARIA charts are a record of the highest selling songs and albums in various genres in Australia, and are therefore more representative of the tastes of audiences nationwide. The Hottest 100, however, is targeted towards the alternative listener, and aims to represent the tastes of the cultured youth of Australia. The chart often features more independent and homegrown artists, and over the years it has earned cult status, with the annual day-long broadcast being heralded as a national event.

So, how do these charts compare? We wanted to analyse how these two charts were related. Here’s what the 2019 Hottest 100 looked like:

Artist Popularity Between Charts

We then examined whether the popularity of the artists differs between the two charts. The artist popularity metric is supplied by Spotify and is a value between 0 and 100, with 100 being the most popular. An artist’s popularity is calculated from the popularity of all of that artist’s tracks. We can see a distinct difference in the distributions of artist popularity between the two charts.

This supports the idea that the Hottest 100 is representative of a more niche group of alternative music listeners.

Popularity data as of September 2020.

Feature Distribution Between Charts

We grouped the different features of each track over all time periods into the two separate charts ARIA and H100 to examine if there were any major differences in their distributions. We can see that tracks in the Hottest 100 are less danceable, have more energetic elements and are likely to sound sadder (valence) on average.

Plotted features are calculated by Spotify using this API endpoint and can range between 0 - 1.
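
A per-chart comparison of this kind can be summarised numerically as well as plotted. The sketch below computes the mean of a few features for each chart. The numbers are made up for illustration and do not reproduce the results described above.

```r
library(dplyr)

# Toy feature rows already labelled with their chart (values are made up).
features_all <- tibble(
  chart        = c("ARIA", "ARIA", "H100", "H100"),
  danceability = c(0.8, 0.6, 0.5, 0.3),
  energy       = c(0.6, 0.4, 0.9, 0.7),
  valence      = c(0.7, 0.5, 0.4, 0.2)
)

# Mean of each feature within each chart.
chart_means <- features_all %>%
  group_by(chart) %>%
  summarise(across(c(danceability, energy, valence), mean), .groups = "drop")

print(chart_means)
```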

Artist Crossover

To analyse artist crossover between charts, we looked at which tracks have featured on both charts since 1993, the beginning of the Hottest 100 playlists. From the playlist and tracks data we determined which tracks featured on both charts and then, grouping by artist, counted how many tracks each artist had on both charts. From the graph we can see that Post Malone and The Offspring have the most tracks featuring on both charts, with Post Malone the highest at 9. Many of these top 20 artists can be considered ‘mainstream’, which suggests that a track is more likely to feature on both charts if the artist is well known and mainstream while also matching the sound and genre popular within the more local Australian Triple J community.
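
The crossover count could be computed along the following lines. This is a sketch on toy data that assumes a track is identified by the same Spotify URI on both charts, which may not always hold if a song exists as several releases. The table and column names are illustrative.

```r
library(dplyr)

# Toy chart entries: one row per year, chart, track and artist.
chart_tracks <- tibble(
  year   = c(1990, 2019, 2019, 2019, 2019, 2019),
  chart  = c("ARIA", "ARIA", "H100", "ARIA", "H100", "ARIA"),
  uri    = c("t0", "t1", "t1", "t2", "t3", "t4"),
  artist = c("Artist A", "Artist A", "Artist A", "Artist B", "Artist B", "Artist A")
)

crossover <- chart_tracks %>%
  filter(year >= 1993) %>%                 # Hottest 100 playlists begin in 1993
  distinct(chart, uri, artist) %>%
  group_by(uri, artist) %>%
  filter(n_distinct(chart) == 2) %>%       # keep tracks seen on both charts
  ungroup() %>%
  distinct(uri, artist) %>%
  count(artist, name = "n_crossover", sort = TRUE)

print(crossover)
```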

Genres In Common

Following the investigation into which tracks featured on both charts and which artists had the most such tracks, we looked at the genres the top 20 artists were categorised into. This graph shows how many times each genre was used to describe a track that featured on both charts. Note that the genre is specific to the artist rather than the song, and an artist can be categorised under many genres.

From the graph we can see that pop and rock were the two most common genre descriptions. Rap was the third highest, although it was used significantly fewer times than pop and rock. Referring back to the Artist Crossover graph, only one of the top five artists with tracks on both charts (Khalid) has a genre described as ‘Pop’. This suggests that many of the artists with fewer crossover tracks are described as pop or rock, while those with more crossover tracks are described as rap or some form of alternative. It also suggests that it is harder for rock or alternative artists to feature on both charts unless they are of high status or are a large mainstream artist, like Post Malone.

Artists With Highest Features

This graph shows the artists with the highest total number of charting songs, given that the artist had at least one song feature on each of the two charts at some point in their career. The distribution is positively skewed, which indicates that many artists had one or two tracks feature in the Triple J Hottest 100 and then went on to have multiple songs chart on the ARIA Top 100. This is most likely due to the nature of the Hottest 100: Triple J is a local Australian station that plays many alternative and up-and-coming artists’ tracks. It is likely that some of these artists had a breakout song in the Hottest 100 and then possibly moved further into the pop genre, releasing more mainstream radio songs and so featuring in the Hottest 100 less often than on the ARIA charts.
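
The filter described here, artists with at least one song on each chart, ranked by total charting songs, can be sketched as follows. The data is made up and the column names `n_aria`, `n_h100` and `total` are assumptions for illustration.

```r
library(dplyr)

# Toy chart entries: one row per chart, track and artist.
chart_entries <- tibble(
  chart  = c("ARIA", "ARIA", "ARIA", "H100", "ARIA", "H100", "ARIA"),
  uri    = c("t1", "t2", "t3", "t4", "t5", "t6", "t6"),
  artist = c("Artist A", "Artist A", "Artist A", "Artist A",
             "Artist B", "Artist C", "Artist C")
)

artist_totals <- chart_entries %>%
  distinct(chart, uri, artist) %>%
  group_by(artist) %>%
  filter(n_distinct(chart) == 2) %>%       # must appear on both charts
  summarise(
    n_aria = n_distinct(uri[chart == "ARIA"]),
    n_h100 = n_distinct(uri[chart == "H100"]),
    total  = n_distinct(uri),
    .groups = "drop"
  ) %>%
  arrange(desc(total))

print(artist_totals)
```

In this toy example Artist B is dropped because they only appear on the ARIA chart.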

Text Analysis

Analysing Track Titles

Song titles draw a listener’s attention and can convey the message behind the entire song. In this section, we break down which words are frequently used in song titles, along with their positive or negative sentiment, to see whether popular songs follow any trends in naming.

30 Most Common Words in Song Titles

We analysed the words used in the titles of each year’s top songs to see whether any were especially popular. The following graph ranks the 30 most common words used in song titles, from most to least frequent. The word “love” is by far the most common, appearing 365 times. “Girl”, the second most common word, appears only 71 times, highlighting the overwhelming use of “love”. Interestingly, “girl” was used about twice as often as “boy”. It also appears that 7 is the most popular number, which may reflect its wider cultural popularity: there are 7 days of the week and 7 continents, and 7 is often treated as a lucky or unique number. “Night” is also more popular than “day”, and black and blue appear to be the most common colours used in titles.
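
A word count of this kind can be sketched as below. This is not necessarily the method used for the graph. The titles are invented, and the short stop-word list is a hypothetical stand-in for a full list.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Invented titles, for illustration only.
titles <- tibble(name = c("Love Again", "Bad Love", "The Girl Next Door"))

# Hypothetical, deliberately tiny stop-word list.
stop_words_small <- c("the", "a", "of", "in", "on")

word_counts <- titles %>%
  # Lower-case each title and split it into word tokens.
  mutate(word = str_extract_all(str_to_lower(name), "[a-z0-9']+")) %>%
  unnest(word) %>%
  filter(!word %in% stop_words_small) %>%
  count(word, sort = TRUE)

print(word_counts)
```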

Sentiment Analysis

When the common words are separated according to whether they have positive or negative sentiment, most fall in the positive category. The most common positive word, “love”, is used 365 times as previously stated, while “bad”, the most common negative word, appears only 33 times. Even the positive words “good” and “beautiful” are more commonly used than “bad”. From this, we can conclude that the most popular songs in Australia favour more positively associated words in their titles.
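
Sentiment labelling of this kind is typically a join between the word counts and a sentiment lexicon. The sketch below uses a tiny hand-written lexicon as a hypothetical stand-in for a real one, and the counts are made up rather than taken from the analysis.

```r
library(dplyr)

# Made-up word counts in the shape of a title word-frequency table.
word_uses <- tibble(
  word = c("love", "bad", "good", "girl"),
  uses = c(5, 2, 3, 4)
)

# Hypothetical mini-lexicon; a real analysis would use a published lexicon.
lexicon <- tibble(
  word      = c("love", "good", "beautiful", "bad"),
  sentiment = c("positive", "positive", "positive", "negative")
)

# Keep only words found in the lexicon, then total their uses per sentiment.
sentiment_counts <- word_uses %>%
  inner_join(lexicon, by = "word") %>%
  group_by(sentiment) %>%
  summarise(n_words = n(), total_uses = sum(uses), .groups = "drop")

print(sentiment_counts)
```

Words that are not in the lexicon, such as “girl” here, are left out of the sentiment totals.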

References

References